g02gcf

g02gcf © Numerical Algorithms Group, 2002.

Purpose

G02GCF Fits a generalized linear model with Poisson errors

Synopsis

[dev,idf,b,irank,se,cov,v,ifail] = g02gcf(x,isx,y<,link,a,wt,mean,offset,v,...
tol,maxit,iprint,epslon,weight,ifail>)

Description

 
 A generalized linear model with Poisson errors consists of the 
 following elements:
 
 (a)   a set of n observations, y , from a Poisson distribution:
                                 i                 
                                   y -(mu)
                               (mu) e 
                               -----------
                                   y!
 
 (b)   X, a set of p independent variables for each observation, 
       x ,x ,...,x .
        1  2      p
 
 (c)   a linear model:
                                  --         
                           (eta)= > (beta) x .
                                  --      j j
 
 (d)   a link between the linear predictor, (eta), and the mean of
       the distribution, (mu), (eta)=g((mu)). The possible links 
       functions are:
 
                                   a                  
      (i) exponent link: (eta)=(mu) , for a constant a,
 
      (ii) identity link: (eta)=(mu),
 
      (iii) log link: (eta)=log(mu),
 
                                     ____
      (iv) square root link: (eta)=\/(mu),
 
                                   1  
      (v) reciprocal link: (eta)= ----.
                                  (mu)
 
 (e)   a measure of fit, the deviance:
             n                 n   {     (  y   )           }
             --        ^^^^    --  {     (   i  )     ^^^^  }
             >  dev(y ,(mu) )= >  2{y log( -----)-(y -(mu) )}
             --      i     i   --  { i   ( ^^^^ )   i     i }
             i=1               i=1 {     ( (mu) )           }
                                   {     (     i)           }
       The linear parameters are estimated by iterative weighted 
       least-squares. An adjusted dependent variable, z, is 
       formed:
                                          d(eta)
                         z=(eta)+(y-(mu)) ------
                                          d(mu)
       and a working weight, w,
                             (       d(eta))2
                           w=((tau)d ------) 
                             (       d(mu) )
                     ____
       where (tau)=\/(mu).
       
       At each iteration an approximation to the estimate of 
               ^^^^^^                                        
       (beta), (beta), is found by the weighted least-squares 
       regression of z on X with weights w.
 
                                     1/2          1/2            
 G02GCF finds a QR decomposition of w   X, i.e., w   X=QR where R 
 is a p by p triangular matrix and Q is an n by p column 
 orthogonal matrix.
 
                            ^^^^^^                 
 If R is of full rank, then (beta) is the solution to:
 
                           ^^^^^^  T 1/2
                          R(beta)=Q w   z
 
 If R is not of full rank a solution is obtained by means of a 
 singular value decomposition (SVD) of R.
 
                               (D 0) T
                           R=Q (0 0)P ,
                              *       
 
 where D is a k by k diagonal matrix with non-zero diagonal 
                                      1/2 
 elements, k being the rank of R and w   X.
 
 This gives the solution
 
                                 (Q  0)      
                     ^^^^^^    -1( *  ) T 1/2
                     (beta)=P D  (0 I )Q w   z
                             1               
 
 P  being the first k columns of P, i.e., P=(P P ).
  1                                           1 0 
 
 The iterations are continued until there is only a small change 
 in the deviance.
 
 The initial values for the algorithm are obtained by taking
 
                            ^^^^^
                            (eta)=g(y)
 
 The fit of the model can be assessed by examining and testing the
 deviance, in particular by comparing the difference in deviance 
 between nested models, i.e., when one model is a sub-model of the
 other. The difference in deviance between two nested models has, 
                       2                                     
 asymptotically, a (chi)  distribution with degress of freedom 
 given by the difference in the degrees of freedom associated with
 the two deviances.
 
                           ^^^^^^                             
 The parameters estimates, (beta), are asymptotically Normally 
 distributed with variance-covariance matrix:
              
              T                        
         -1 -1                         
      C=R  R    in the full rank case, otherwise
 
           -2 T
      C=P D  P 
         1    1
 
 The residuals and influence statistics can also be examined.
 
                                ^^^^^  ^^^^^^                   
 The estimated linear predictor (eta)=X(beta), can be written as 
   1/2                                                          
 Hw   z for an n by n matrix H. The ith diagonal elements of H, 
 h , give a measure of the influence of the ith values of the 
  i
 independent variables on the fitted regression model. These are 
 known as leverages.
 
                                ^^^^  -1 ^^^^^ 
 The fitted values are given by (mu)=g  ((eta)).
 
 G02GCF also computes the deviance residuals, r:
 
                                     _____________
                           ^^^^     /       ^^^^  
                r =sign(y -(mu) )  / dev(y ,(mu) ).
                 i       i     i \/       i     i 
 
 An option allows prior weights to be used with the model.
 
 In many linear regression models the first term is taken as a 
 mean term or an intercept, i.e., x    = 1, for i=1,2,...,n. This 
                                   i,1                           
 is provided as an option.
 
 Often only some of the possible independent variables are 
 included in a model; the facility to select variables to be 
 included in the model is provided.
 
 If part of the linear predictor can be represented by a variables
 with a known coefficient then this can be included in the model 
 by using an offset, o:
 
                                --         
                       (eta)=o+ > (beta) x .
                                --      j j
 
 If the model is not of full rank the solution given will be only 
 one of the possible solutions. Only certain linear combimations 
 of the parameters will have unique estimates, these are known as 
 estimable functions.
 
 Details of the SVD, are made available, in the form of the matrix
  *
 P :
 
                               ( -1 T)
                               (D  P )
                               (    1)
                             * (  T  )
                            P =( P   ).
                               (  0  )

Parameters

g02gcf

Required Input Arguments:

x (:,:)                               real
isx (:)                               integer
y (:)                                 real

Optional Input Arguments:                       <Default>

link (1)                              string   'l'
a                                     real     0.0
wt (:)                                real     zeros(length(y),1)
mean (1)                              string   'm'
offset (1)                            string   'n'
v (:,:)                               real     zeros(length(y),ip+7)
tol                                   real     10*eps
maxit                                 integer  10
iprint                                integer  -1
epslon                                real     eps
weight (1)                            string   g02gcf04(wt)
ifail                                 integer  -1

Output Arguments:

dev                                   real
idf                                   integer
b (:)                                 real
irank                                 integer
se (:)                                real
cov (:)                               real
v (:,:)                               real
ifail                                 integer